
Going Down the Natural Language Processing Pipeline

#artificialintelligence

Communication plays a big part in our everyday lives. We talk to different people in different languages, but what about communicating with technology? Nowadays, nearly everyone has some sort of device, and we often use it to find answers to our questions, for example by asking Siri, "Where can I find the nearest sushi place?" But here's the thing: computers don't natively understand English; they run on code whose syntax is entirely different from the languages we speak.


A Natural Language Processing Pipeline for Detecting Informal Data References in Academic Literature

Lafia, Sara, Fan, Lizhou, Hemphill, Libby

arXiv.org Artificial Intelligence

Discovering authoritative links between publications and the datasets that they use can be a labor-intensive process. We introduce a natural language processing pipeline that retrieves and reviews publications for informal references to research datasets, which complements the work of data librarians. We first describe the components of the pipeline and then apply it to expand an authoritative bibliography linking thousands of social science studies to the data-related publications in which they are used. The pipeline increases recall for literature to review for inclusion in data-related collections of publications and makes it possible to detect informal data references at scale. We contribute (1) a novel Named Entity Recognition (NER) model that reliably detects informal data references and (2) a dataset connecting items from social science literature with datasets they reference. Together, these contributions enable future work on data reference, data citation networks, and data reuse.
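The paper's contribution is a trained NER model; as a toy illustration of the underlying task (spotting informal dataset mentions in sentences), here is a simple rule-based matcher. The dataset names and cue phrases below are hypothetical examples, not drawn from the paper:

```python
import re

# Toy illustration only: the paper trains an NER model, but a rule-based
# matcher conveys the task of detecting informal data references.
# The dataset names below are hypothetical examples.
KNOWN_DATASETS = ["General Social Survey", "Panel Study of Income Dynamics"]
CUE_PATTERN = re.compile(r"(?:data (?:from|in) the|drawn from the)\s+([A-Z][\w\s]+?)(?=[.,;])")

def find_informal_references(sentence):
    """Return candidate dataset mentions: known names plus cue-phrase matches."""
    hits = [name for name in KNOWN_DATASETS if name in sentence]
    hits += [m.strip() for m in CUE_PATTERN.findall(sentence)]
    return sorted(set(hits))

print(find_informal_references(
    "Our analysis uses data from the General Social Survey, collected in 2018."))
```

A real system would replace both lists with a learned model, since informal references ("the 2018 survey data") rarely match fixed patterns; that gap is exactly what the paper's NER model addresses.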


GitHub - booknlp/booknlp: BookNLP, a natural language processing pipeline for books

#artificialintelligence

The larger and more accurate big model is fit for GPUs and multi-core computers; the faster small model is more appropriate for personal computers. See the table below for a comparison of the differences, both in overall speed and in accuracy on the tasks that BookNLP performs. To explore running BookNLP in Google Colab on a GPU, see this notebook. If using a GPU, install pytorch for your system and CUDA version by following the installation instructions on https://pytorch.org. This runs the full BookNLP pipeline; you can run only some elements of the pipeline (to cut down on computational time) by specifying them in that parameter (e.g., to run only entity tagging and event tagging, change model_params above to include "pipeline":"entity,event").
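A minimal sketch of invoking a subset of the pipeline, following the README's model_params convention. The file paths are placeholders, and the call itself is shown commented out since it assumes `pip install booknlp` and downloads model weights on first run:

```python
# Restrict the BookNLP pipeline to two tasks to cut computational time.
model_params = {
    "pipeline": "entity,event",  # run only entity tagging and event tagging
    "model": "small",            # "big" suits GPUs; "small" suits laptops
}

# from booknlp.booknlp import BookNLP
# booknlp = BookNLP("en", model_params)
# booknlp.process("books/input.txt", "output/", "input")  # placeholder paths
```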


Natural Language Processing Pipeline

#artificialintelligence

In this article, I cover the development of a Natural Language Processing (NLP) pipeline, so that interested parties can use it as a guide through the stages of a study or project. For those looking for a definition, NLP is a subfield of machine learning that works with natural language, whether text or audio. In technical terms, it studies the capabilities and limitations of machines in understanding human language. A practical example: although this text was written in English, you are probably reading it in your native language, if that is not English.


How Duolingo uses AI in every part of its app

#artificialintelligence

Language learning has surged during the pandemic. Duolingo, which is synonymous with gamified language learning, saw its fastest growth period this March, with a 101% global increase in new users. From those who simply have more time on their hands to students trying to keep up during the pandemic school year, the app is a huge boon. All that extra data isn't going to waste: because Duolingo invested early in AI, the app keeps getting better as it grows beyond the 30 million monthly active users reported in December 2019. "One of the things people don't know is that even though Duolingo is very gamified and it just looks very cutesy, we actually record everything you do to try to basically have a model of what you know," Duolingo CEO Luis von Ahn told VentureBeat. We spoke to von Ahn about all the ways Duolingo uses AI and then followed up with the company's research director, Burr Settles, who joined in 2013 (Duolingo was founded in 2012). "We hired this guy named Burr who has a Ph.D. in AI," von Ahn said when describing the company's first foray into AI. "He came in and the idea was 'Try to figure out how to use AI to improve Duolingo.'" We've already done deep dives into how Duolingo uses AI to humanize virtual language lessons and to drive its English proficiency tests.


Natural Language Processing Pipeline

#artificialintelligence

If we were asked to build an NLP application, think about how we would approach doing so at an organization. We would normally walk through the requirements and break the problem down into several sub-problems, then try to develop a step-by-step procedure to solve them. Since language processing is involved, we would also list all the forms of text processing needed at each step.
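The step-by-step decomposition described above can be sketched as a minimal text-processing pipeline. The stages chosen here (cleaning, tokenizing, stopword removal) are illustrative, not a prescribed design:

```python
import re

# Minimal illustrative pipeline: each stage is one of the sub-problems
# identified when breaking the task down. Stage choices are examples only.
def clean(text):
    """Collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()

def tokenize(text):
    """Lowercase and split into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stopwords(tokens):
    """Drop a tiny example stopword list."""
    stopwords = {"a", "an", "the", "to", "of"}
    return [t for t in tokens if t not in stopwords]

def pipeline(text):
    """Run the stages in sequence, each feeding the next."""
    return remove_stopwords(tokenize(clean(text)))

print(pipeline("  The quick  brown fox jumps over the lazy dog. "))
```

Each function maps cleanly to one sub-problem, so stages can be tested, swapped, or extended independently, which is the main benefit of framing the work as a pipeline.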